Application of the Gaussian Process Regression Method Based on a Combined Kernel Function in Engine Performance Prediction

At present, regression modeling methods fail to achieve high simulation accuracy, which limits the application of simulation technology in fields such as virtual calibration and hardware-in-the-loop real-time simulation in the automotive industry. After fully considering the abruptness and complexity of engine performance, a Gaussian process regression modeling method based on a combined kernel function is proposed and verified in this study for engine torque, emission, and temperature predictions. Comparisons with linear regression, decision tree, support vector machine (abbreviated as SVM), neural network, and other Gaussian regression methods show that the Gaussian regression method based on the combined kernel function proposed in this study achieves higher prediction accuracy. Fitting results show that the $R^2$ values of the prediction models for engine torque and for exhaust gas temperature after the engine turbo (abbreviated as T4) reach 1.00, and the $R^2$ value of the nitrogen oxide (abbreviated as NOx) prediction model reaches 0.9999. Generalization tests show that, for entirely new world harmonized transient cycle data, the $R^2$ value of the engine torque prediction is 0.9993, that of the exhaust gas temperature prediction is 0.995, and that of the NOx emission prediction is 0.9962. These results show that the model achieves high accuracy for performance, temperature, and emission prediction under steady-state and transient operating conditions.


INTRODUCTION
At present, simulation technology is proving its value in the automotive field, having developed from performance simulation to full-life-cycle simulation of products. Simulation technology has the characteristics of visibility, verifiability, perception, and so forth. 1 It can be used to accelerate the automotive product development phase and improve system reliability; 2,3 however, low simulation accuracy limits its wide application in the fields of virtual calibration and hardware-in-the-loop real-time simulation. The main reasons for the low accuracy of engine performance simulation are as follows: 1. Engine performance can change abruptly. Taking carbon monoxide emission as an example, when the exhaust gas temperature and air−fuel ratio exceed a certain limit, carbon monoxide emission may change abruptly, which makes it difficult for traditional Gaussian process regression (GPR) algorithms, support vector machines (SVM), and their covariance functions to reflect the correlation between variables; 2. The engine is a complex system involving multiple disciplines such as mechanics, thermodynamics, chemistry, and so forth, which brings challenges to feature extraction in the regression modeling process.
In the field of engine performance prediction, scholars have conducted long-term research. Engine modeling technology can be divided into mechanism modeling technology and regression modeling technology.
Mechanism modeling is a modeling technology based on the physical properties of each component. This technology analyzes the working process of the object and widely adopts the ideal state equation, look-up tables, and other methods to establish the airflow process model and thermodynamic process model of the engine. 4−6 The advantage of the mechanism modeling method is that it helps to understand the characteristics of engine components, the interaction between components, and the effect of components on the engine's overall performance. 7 However, the mechanism modeling method has the following disadvantages: 1. The operation process of the engine is complex, involving multiple disciplines such as mechanics, thermodynamics, chemistry, electronic control technology, and so forth, and the combustion process of the engine is still not fully understood, which brings great challenges to the mechanism modeling process; 2. The model shows low accuracy, and the calibration process is challenging. Mechanism modeling methods widely adopt approximations or idealizations such as the ideal state equation and look-up tables, and many parameters can be obtained only through data fitting instead of direct experiments. This makes the model calibration process difficult and the model accuracy low.
Regression modeling is a mathematical modeling method that applies statistical methods to describe the working process quantitatively. 8,9 Higher prediction accuracy could be achieved with neural networks, decision trees, SVM, and so forth. Kang and Zhou 10 studied the relationship between the engine torque and cylinder pressure through the linear regression fitting method and obtained the correlation between the engine torque and cylinder pressure: P = 0.0229N + 0.9969. Zhang et al. 11 built a diesel engine emission prediction model with a three-layer BP neural network, and the result showed that the error between the model prediction result and experimental result was less than 9%. Hui and Li 12 used the weighted least-squares method to establish a linear regression model for engine torque prediction. Test results showed that the model prediction error was 7.60%. Li et al. 13 built an RGF model for engine torque and fuel consumption rate prediction, and the results showed that the prediction error of engine torque under steady-state and transient conditions would be within 5%. Shahpouri et al. 14 built an engine soot emission prediction model with the regression tree (RT), ensemble of RTs, SVMs, GPR, artificial neural network, and Bayesian neural network, and results showed that the fitting $R^2$ value of the engine black-box model using GPR and feature selection by LASSO reached 0.96, and the fitting $R^2$ value of the gray-box model using SVM reached 0.97.
The above-mentioned algorithms have wide applications in the field of machine learning, and many scholars have conducted in-depth research on them. However, the application performance in the field of engine performance prediction needs to be further improved for higher simulation accuracy.
In recent years, GPR has been widely used in the field of nonlinear system modeling. In a Gaussian process, each point in a continuous input space is associated with a normally distributed random variable. A Gaussian process is a random process in which observations appear in a continuous domain.
The kernel function in Gaussian regression characterizes the correlation between variables. As part of the model assumptions, different kernel functions can achieve different fitting results. Commonly used kernel functions include the radial basis function kernel (abbreviated as RBF kernel), Matern kernel, exponential function kernel (exponential kernel), rational quadratic kernel (abbreviated as RQ kernel), periodic kernel, polynomial kernel, and so forth.
Without limiting the form of the kernel function, Gaussian regression is theoretically a universal approximator of any continuous function in a compact space. In addition, Gaussian regression can provide the posterior of the prediction result, and this posterior has an analytical form, so Gaussian regression is a general and analytic model. 15 Based on the above advantages, Gaussian regression can be used to quickly and efficiently create models of engines, power systems, or any other systems, to adjust and optimize calibration parameters more conveniently, and to reduce the need for calibration development work on the engine test bench or vehicle, which makes powertrain system development more efficient. Although Gaussian regression has the advantages of generality and analyzability, 16−19 it is not flexible enough when the data in different areas changes abruptly, and a single kernel function cannot fit such data effectively.
Based on the above analysis, this study proposes and demonstrates the technical feasibility of the GPR algorithm based on a combined kernel function (Section 2), and a black-box model of a 3.0 L diesel engine is established (Section 3). The engine torque, emissions, and temperature performance are predicted using the method proposed in this study (Sections 4.1 and 4.3), and the prediction accuracy of engine torque by linear regression, decision tree, SVM, neural network, GPR, and the method proposed in this study is compared using the same training dataset in Section 4.2. The generalization ability of the model is validated under transient running conditions, which are not included in the training dataset.

GPR TECHNOLOGY BASED ON A COMBINED KERNEL FUNCTION
GPR is a major data fitting method in the field of machine learning. Theoretically, this method can provide nonlinear models for any system. Although the model space is infinite-dimensional, overfitting can be prevented by empirical Bayesian methods, which provide a maximum-likelihood model given a limited set of measurement data. The model fitted by the GPR method is given as a Gaussian probability distribution for each array of input variables. From the weight-space point of view, GPR can be derived from the principle of Bayesian linear regression, that is, for a given set of N independent learning samples $X = \{X_1, X_2, \ldots, X_N\}$, the regression model is given by eq 1, where ω is the weight coefficient and ε is the residual or noise. Bayesian linear regression is a linear parametric model; to characterize a nonlinear relationship between variables, a given function $\Phi(X)$ can be used to map X to a high-dimensional space, as shown in eq 2.
Since the mapping $\Phi(X)$ is independent of the model weights, it can be substituted directly into the result of Bayesian linear regression, as shown in eqs 3 and 4.
where $p(f_* \mid X, y, X_*)$ is the posterior distribution of the predicted value. Eq 3 can be rewritten as eq 5, that is, using GPR to predict the mean and covariance values.
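As a minimal sketch of what predicting the mean and covariance with GPR involves, the following NumPy code implements the standard function-space predictor for a zero-mean prior with homoscedastic Gaussian noise. It is assumed here, not confirmed by the text, that this matches the kernel form referred to as eq 5; the function names, the toy sine data, and the noise level are illustrative assumptions.

```python
import numpy as np

def se_kernel(a, b, sigma_f=1.0, sigma_l=1.0):
    """Squared exponential kernel matrix between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    sq_dist = np.sum(diff * diff, axis=2)
    return sigma_f**2 * np.exp(-sq_dist / (2.0 * sigma_l**2))

def gpr_predict(x_train, y_train, x_test, kernel, noise_std=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    # Training covariance with the noise variance added on the diagonal.
    k_tt = kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    k_st = kernel(x_test, x_train)
    k_ss = kernel(x_test, x_test)
    # A Cholesky factorization avoids forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    posterior_mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    posterior_cov = k_ss - v.T @ v
    return posterior_mean, posterior_cov

# Hypothetical toy data: a smooth function sampled at 8 points.
x_train = np.linspace(0.0, 5.0, 8).reshape(-1, 1)
y_train = np.sin(x_train).ravel()
x_test = np.array([[1.3], [2.5], [4.1]])

posterior_mean, posterior_cov = gpr_predict(x_train, y_train, x_test, se_kernel)
print(posterior_mean)
print(np.diag(posterior_cov))
```

The diagonal of the returned covariance gives the predictive variance of the latent function at each test point, which is the analytical posterior that distinguishes GPR from point-estimate regressors.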
The applicability of a Gaussian process is limited by its basic mathematical assumptions, namely: 1. The dataset obeys a Gaussian distribution; 2. The sample noise is homoscedastic Gaussian noise; 3. The method is suitable for fitting smooth functions; 4. The covariance function holds between different variables of the dataset. However, these assumptions are not always met in practice. For example, when the exhaust gas temperature exceeds a limit, the emission changes abruptly, and the sample noise no longer meets the assumption of homoscedastic noise. For the prediction of abruptly changing signals, traditional GPR is not flexible enough, and it is difficult for a single kernel function to achieve high fitting accuracy. This study takes engine torque prediction based on main injection quantity as an example, analyzes the fitting performance of the squared exponential kernel function and the rational quadratic kernel function, and verifies the technical feasibility of the GPR technique based on the combined kernel function for engine performance prediction.

Squared Exponential Kernel Function.
The squared exponential kernel, also called the Gaussian kernel or RBF kernel, is the function-space expression of the RBF regression model with infinitely many basis functions. The squared exponential kernel function, whose expression is shown in eq 6, is widely applied in GPR and SVM, where $\sigma_l$ is the characteristic length scale of the signal, which describes the smoothness of the function. When $\sigma_l$ is small, the dynamic response of the fitted function is better, but it is accompanied by the risk of overshooting; when $\sigma_l$ is large, the resulting function tends to be smooth. $\sigma_f$ is the standard deviation of the signal, which characterizes the deviation of the fitted function from the signal mean value. When $\sigma_f^2$ is small, the fitted function deviates only slightly from the signal mean value; when $\sigma_f^2$ is large, the fluctuation of the fitted function becomes larger. 21 $\lVert x_i - x_j \rVert^2$ can be regarded as the squared Euclidean distance between two feature vectors. Because the value of the squared exponential kernel function decreases as this distance increases and, for unit signal variance, is bounded between 0 and 1 (its value is 1 when $x_i = x_j$), it is a ready-made similarity measure. The feature space of the kernel has an infinite number of dimensions.
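The roles of the two parameters can be checked numerically with a short sketch. The code below uses the commonly cited form of the squared exponential kernel, $\sigma_f^2 \exp(-\lVert x_i - x_j \rVert^2 / (2\sigma_l^2))$, which is assumed here to correspond to eq 6; the function name and the sample points are illustrative.

```python
import math

def squared_exponential(x_i, x_j, sigma_f=1.0, sigma_l=1.0):
    """Assumed standard form: sigma_f^2 * exp(-||x_i - x_j||^2 / (2 * sigma_l^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return sigma_f ** 2 * math.exp(-sq_dist / (2.0 * sigma_l ** 2))

# Similarity falls from 1 toward 0 as the two points move apart (sigma_f = 1).
same = squared_exponential([1.0, 2.0], [1.0, 2.0])
near = squared_exponential([1.0, 2.0], [1.5, 2.0])
far = squared_exponential([1.0, 2.0], [4.0, 2.0])

# A short length scale decorrelates points faster than a long one.
short_scale = squared_exponential([0.0], [1.0], sigma_l=0.5)
long_scale = squared_exponential([0.0], [1.0], sigma_l=2.0)

print(same, near, far)
print(short_scale, long_scale)
```

With a short length scale, neighboring points are only weakly correlated, so the fitted function can follow rapid changes at the cost of possible overshoot; a long length scale ties distant points together and yields a smoother fit.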
It can be seen from eq 6 that the squared exponential kernel function is infinitely differentiable, which means that a GPR with the squared exponential kernel function as its covariance function has mean-square derivatives of all orders. Meanwhile, the squared exponential kernel function replaces the inner product of the basis functions with a kernel, and the advantage of this function is that the error is relatively controllable when dealing with large, high-dimensional datasets. Therefore, the squared exponential kernel function is well suited to modeling smooth and continuous datasets, but it performs poorly when there are many training samples or when the samples contain many features. 22,23

Rational Quadratic Kernel.
The expression of the rational quadratic kernel is shown in eq 7.
where $\sigma_l$ is the characteristic length scale of the signal, α is a positive-valued scale-mixture parameter, and r is the Euclidean distance between $x_i$ and $x_j$, which is defined in eq 8.
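A brief sketch can make the effect of α concrete. It assumes the widely used form $\sigma_f^2 (1 + r^2 / (2\alpha\sigma_l^2))^{-\alpha}$ for eq 7, with a signal standard deviation $\sigma_f$ added as an assumption, and compares it with the squared exponential kernel at the same length scale. All names and values are illustrative.

```python
import math

def euclidean_distance(x_i, x_j):
    """r = ||x_i - x_j||, the distance the text attributes to eq 8."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def rational_quadratic(x_i, x_j, sigma_l=1.0, alpha=1.0, sigma_f=1.0):
    """Assumed standard form: sigma_f^2 * (1 + r^2 / (2 * alpha * sigma_l^2)) ** (-alpha)."""
    r = euclidean_distance(x_i, x_j)
    return sigma_f ** 2 * (1.0 + r ** 2 / (2.0 * alpha * sigma_l ** 2)) ** (-alpha)

def squared_exponential(x_i, x_j, sigma_l=1.0, sigma_f=1.0):
    """Squared exponential kernel with the same length scale, for comparison."""
    r = euclidean_distance(x_i, x_j)
    return sigma_f ** 2 * math.exp(-r ** 2 / (2.0 * sigma_l ** 2))

x_i, x_j = [0.0, 0.0], [1.0, 1.0]
rq_small_alpha = rational_quadratic(x_i, x_j, alpha=0.5)
rq_large_alpha = rational_quadratic(x_i, x_j, alpha=1e6)
se_value = squared_exponential(x_i, x_j)

print(rq_small_alpha, rq_large_alpha, se_value)
```

For small α the kernel decays more slowly with distance than the squared exponential kernel, so distant samples keep some influence; for very large α the two kernels become numerically indistinguishable.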
The rational quadratic kernel is an infinite sum (scale mixture) of squared exponential kernel functions with different characteristic length scales. When α → ∞, the rational quadratic kernel is equivalent to the squared exponential kernel function with $\sigma_l$ as the characteristic length scale. The rational quadratic kernel has a wide scope, which could help to reduce the sensitivity of the model to smaller datasets and improve the generalization ability and dynamic response performance. 24

Combined Kernel Function.
Based on the above analysis, as shown in eq 9, this study constructs a new kernel function from the squared exponential kernel and the rational quadratic kernel, which not only retains the advantage of the squared exponential kernel function for modeling high-dimensional datasets but also improves the dynamic response of the fitting results through the rational quadratic kernel function.
where α is the weighting coefficient of the rational quadratic kernel function in the combined kernel function. To further verify the fitting performance of the squared exponential kernel function, the rational quadratic kernel function, and the combined kernel function, this paper selects the test data of a 3.0 L diesel engine under transient working conditions for verification. There are 60 sample points in total; each point contains two variables: engine main injection quantity and engine torque. The basic information of the engine is shown in Table 1, and the dataset information is shown in Figure 1. It can be seen from Figure 1 that the dataset contains both a relatively smooth, stable operation stage and abrupt signal changes.
With the same dataset, different kernel functions are used for engine torque prediction. As shown in eqs 10−13, the root-mean-square error (RMSE), $R^2$ (goodness of fit), mean square error (MSE), and mean absolute error (MAE) of the engine torque are calculated by comparing the predicted value with the true value to evaluate the fitting performance of different kernel functions.
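The four metrics can be computed as in the sketch below, which uses their usual textbook definitions (assumed to match eqs 10−13). The torque values are made-up numbers for illustration only, not data from this study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, R^2, MSE, and MAE for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
    }

# Hypothetical torque values in N·m.
torque_true = [100.0, 150.0, 200.0, 250.0]
torque_pred = [102.0, 148.0, 203.0, 249.0]

metrics = regression_metrics(torque_true, torque_pred)
print(metrics)
```

Since MSE is the square of RMSE, the two always rank models identically; MAE is less sensitive to a few large errors, which matters when a dataset contains abrupt changes.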
A holdout validation method is used in the model training process; the training results of the GPR models using the squared exponential kernel function, rational quadratic kernel function, and combined kernel function are shown in Figure 2 and Table 2. As shown in Figure 3, a comparison chart of the predicted results against the true values is used in this paper to illustrate the fitting performance of the model at different sample points. The predicted results of the model should theoretically be close to the true values, that is, all operating points should lie on the diagonal line; the distance between each operating point and the diagonal line represents the prediction error of that point, and the prediction error of a good model should be as small as possible. The prediction results show that, compared with GPR with the squared exponential kernel function, the GPR model with the rational quadratic kernel function achieves a higher $R^2$ value ($R^2$ = 0.99) and lower RMSE (7.9321), MSE (62.919), and MAE (3.2494) values. The GPR with the combined kernel function, however, has an $R^2$ value of 1.00, its RMSE value is reduced to 3.262, and its MSE and MAE values are also lower.

CONSTRUCTION OF ENGINE BLACK BOX MODEL
Engine operating conditions change rapidly and are influenced by many factors. As shown in Figure 4, the operating data of the engine under steady-state DoE test conditions are taken as sample data 25 for the construction of an engine black box model. The main influencing factors of engine torque, exhaust gas temperature after turbo (shown as T4 in Figure 4), and NOx raw emission (shown as NOx in Figure 4) are taken into consideration. The operating points covered by the dataset are shown in Figure 5.

GPR-BASED ENGINE MODEL TRAINING
In this study, the operating data under the DoE test condition are used as the training dataset, and the combined kernel function is used for the fitting of the engine black box system. Version information of the main tools used is shown in Table 3. The computer used is a mobile workstation equipped with an 8-core/16-thread processor and an NVIDIA Quadro T600 discrete graphics card, and GPU parallel computing is used to accelerate the training process.

Training of Engine Torque Model.
The GPR-based model training process is mainly composed of two parts: hyperparameter optimization and data fitting.
As shown in eq 9, the kernel function of the regression model used in this study is a weighted combination of the squared exponential kernel function and the rational quadratic kernel function. After rearrangement, the combined kernel function can be expressed as eq 15.
where $\theta_1$ is the standard deviation of the signal in the squared exponential kernel function, $\theta_2$ is the characteristic length scale of the signal in the squared exponential kernel function, $\theta_3$ is the standard deviation of the signal in the rational quadratic kernel function, $\theta_4$ is the characteristic length scale of the signal in the rational quadratic kernel function, $\theta_5$ is the scale-mixture parameter of the rational quadratic kernel function, and $\theta_6$ is the weight coefficient of the rational quadratic kernel function in the combined kernel function. It can be seen from eq 15 that there are six hyperparameters, $\theta_1$ to $\theta_6$. Hyperparameter optimization is the process of finding the optimal values of $\theta_1$ to $\theta_6$. The algorithm uses automatic hyperparameter optimization to find the hyperparameters that minimize the fivefold cross-validation loss.
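One possible reading of the six-hyperparameter kernel and the fivefold cross-validation objective is sketched below. The additive form $k_{SE} + \theta_6 k_{RQ}$ is an assumption (eq 15 may be arranged differently), the small candidate search is a naive stand-in for the automatic hyperparameter optimization used in this study, and the step-shaped toy signal, noise level, and candidate values are hypothetical.

```python
import numpy as np

def pairwise_sq_dist(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=2)

def combined_kernel(a, b, theta):
    """Assumed additive form: k_SE + theta6 * k_RQ, with theta1..theta6 as in the text."""
    t1, t2, t3, t4, t5, t6 = theta
    sq_dist = pairwise_sq_dist(a, b)
    k_se = t1**2 * np.exp(-sq_dist / (2.0 * t2**2))
    k_rq = t3**2 * (1.0 + sq_dist / (2.0 * t5 * t4**2)) ** (-t5)
    return k_se + t6 * k_rq

def gpr_mean(x_train, y_train, x_test, theta, noise_std=0.1):
    """Posterior mean of a GP whose prior mean is the training-sample mean."""
    y_mean = y_train.mean()
    k_tt = combined_kernel(x_train, x_train, theta)
    k_tt = k_tt + noise_std**2 * np.eye(len(x_train))
    weights = np.linalg.solve(k_tt, y_train - y_mean)
    return combined_kernel(x_test, x_train, theta) @ weights + y_mean

def kfold_cv_loss(x, y, theta, n_folds=5, seed=0):
    """Mean squared prediction error over n_folds held-out folds."""
    order = np.random.default_rng(seed).permutation(len(x))
    sq_errors = []
    for held_out in np.array_split(order, n_folds):
        train = np.setdiff1d(order, held_out)
        pred = gpr_mean(x[train], y[train], x[held_out], theta)
        sq_errors.append((pred - y[held_out]) ** 2)
    return float(np.mean(np.concatenate(sq_errors)))

# Hypothetical toy signal: a smooth trend plus an abrupt step at x = 5.
x = np.linspace(0.0, 10.0, 60).reshape(-1, 1)
y = np.sin(x.ravel()) + np.where(x.ravel() < 5.0, 0.0, 3.0)

# Hypothetical candidates; theta6 = 0 switches the rational quadratic term off.
candidates = {
    "se_only": (1.0, 1.0, 1.0, 1.0, 1.0, 0.0),
    "combined_a": (1.0, 1.0, 1.0, 0.3, 0.5, 1.0),
    "combined_b": (1.0, 2.0, 1.0, 0.5, 1.0, 0.5),
}
losses = {name: kfold_cv_loss(x, y, theta) for name, theta in candidates.items()}
best_name = min(losses, key=losses.get)
print(losses)
print("lowest fivefold CV loss:", best_name)
```

A nonnegative weighted sum of two valid kernels is itself a valid (positive semidefinite) kernel, which is why the combination can be used directly as a covariance function as long as $\theta_6 \ge 0$.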
As shown in Table 4 and Figure 7, after 30 iterations, the observed best objective function value is 1.6747, and the standard deviation of the dataset (shown as sigma in the table) is 0.00010001. The obtained hyperparameter optimal solution is shown in Table 5.
In this study, norms from functional analysis are used to measure the dispersion of a dataset in vector space. The L2 norm, also known as the Euclidean norm, is defined as the square root of the sum of the squared elements of the vector, that is, the Euclidean distance of the vector from the origin; its calculation formula is shown in eq 16. The infinity norm is defined as the largest absolute value among the elements of the vector, and its calculation formula is shown in eq 17. The L2 norm and the infinity norm characterize the degree of dispersion between the sample data and the fitting results.
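The two norms can be illustrated with a minimal sketch based on their standard definitions (assumed to match eqs 16 and 17); the residual vector is a made-up example.

```python
import math

def l2_norm(vector):
    """Euclidean norm: square root of the sum of squared elements."""
    return math.sqrt(sum(v * v for v in vector))

def infinity_norm(vector):
    """Infinity norm: largest absolute value among the elements."""
    return max(abs(v) for v in vector)

# Hypothetical residuals between measured and fitted values.
residual = [3.0, -4.0, 0.0]

print(l2_norm(residual))        # 5.0
print(infinity_norm(residual))  # 4.0
```

The L2 norm aggregates all deviations, whereas the infinity norm reports only the single worst deviation, so the two together describe both the overall and the worst-case dispersion.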
The fitting results are shown in Table 6, Figure 8, and Table 7. The results show that the infinity norm of the final gradient is 37.96 (shown as norm grad in the table), the L2 norm of the final step is 0.1074 (shown as norm step in the table), and the relative infinity norm of the final gradient is 0.008030. The degree of dispersion between the predicted and actual values of engine torque is small: $R^2$ reaches 1.00, RMSE is 1.7381, MSE is 3.0211, and MAE is 1.0077.

Comparison with Other Commonly Used Fitting Methods for Engine Torque Prediction.
In recent years, with the continuous in-depth exploration of machine learning technology, researchers have proposed and verified many prediction techniques, such as linear regression, decision tree, SVM, GPR, neural network, and so forth. These prediction methods have a wide range of applications in the field of machine learning. However, for engine performance prediction, the performance of different prediction methods varies widely.
The same training dataset used in this study is used for prediction comparison of engine torque performance using different data fitting methods included in the officially released Regression Learner APP from MathWorks; the fitting result is shown in Table 8.
Comparison results show the following: 1. Among the linear regression fitting methods, interaction linear regression achieves a lower RMSE value (RMSE = 7.7276) than linear regression (RMSE = 11.34) and robust linear regression (RMSE = 11.786), because it adds interaction terms to the regression model, which helps to capture relationships between variables; 2. Apart from the method proposed in this study, the bagged tree achieves the lowest RMSE value (RMSE = 5.149). Unlike other decision tree algorithms, the bagged tree uses many trees for data fitting, which helps to leverage the insight of many models; 3. SVM was originally formulated as a linear classifier that performs binary classification of data in a supervised learning manner. SVM performs well in classification problems but performs poorly in engine torque prediction; 4. The neural network has the characteristics of large-scale parallel processing, distributed storage, elastic topology, high redundancy, and nonlinear operation. The medium neural network achieves a relatively low RMSE value (RMSE = 6.1125) in torque prediction; 5. The GPR algorithm based on the combined kernel function proposed in this study has the lowest RMSE value (RMSE = 1.7381).

Training of T4 and NOx Emission Models.
In this study, data modeling of T4 and NOx emissions is carried out. The modeling results are shown in Figure 9 and Tables 9 and 10.
The fitting results of T4 and NOx emissions show that the infinity norm of the final gradient is 81.05 and 95.53, the L2 norm of the final step is 0.3889 and $5.844 \times 10^{-3}$, and the relative infinity norm of the final gradient is $9.488 \times 10^{-3}$ and 8.092 ×
The model verification results are shown in Figure 10. Under transient conditions, the errors of the engine torque, T4, and NOx emission results are small. The $R^2$ value of the engine torque prediction is 0.9993, the $R^2$ value of the T4 prediction is 0.995, and the $R^2$ value of the NOx emission prediction is 0.9962. The results show that the GPR technique based on the combined kernel function adopted in this study can be applied to engine performance prediction (shown as torque prediction in this study), temperature prediction (shown as T4 temperature prediction in this study), and emission prediction (shown as NOx prediction in this study).

CONCLUSIONS
In this study, we explore the application of GPR technology based on a combined kernel function in the fields of engine torque prediction, temperature prediction, and emission prediction. The above analyses lead to the following conclusions: 1. Compared with the squared exponential kernel function and the rational quadratic kernel function alone, the combined kernel function constructed in this study not only retains the advantage of the squared exponential kernel function in modeling high-dimensional samples but also improves the dynamic response performance through the rational quadratic kernel function; 2. Comparisons with linear regression, decision tree, SVM, neural network, and Gaussian regression show that the GPR technique based on the combined kernel function proposed in this study achieves higher prediction accuracy in engine torque prediction, emission prediction (NOx emission prediction), and exhaust temperature prediction (T4 temperature prediction). The $R^2$ values of the engine torque and T4 prediction models reach 1.00, and the $R^2$ value of the NOx prediction model reaches 0.9999; 3. The generalization ability verification results show that, for new data that the model has not seen during training, the $R^2$ value of the engine torque prediction is 0.9993, the $R^2$ value of the T4 prediction is 0.995, and the $R^2$ value of the NOx emission prediction is 0.9962; that is, the model still achieves high prediction accuracy for data not included in the training dataset; 4. The Gaussian regression technique based on the combined kernel function proposed in this study is suitable both for engine prediction under steady-state operating conditions (as shown by the model training results) and for engine prediction under transient conditions (as shown by the model generalization verification test).
As mentioned above, the GPR algorithm based on a combined kernel function proposed in this study can effectively improve engine performance simulation accuracy, and further research can be carried out in the fields of engine/vehicle virtual calibration, DoE design, and hardware-in-the-loop real-time simulation.